title: “Assessing and improving measurement of ESCS in PISA 2018”

subtitle: “Markup week 2 exercise 2”

author:

name: Kirsten van Kessel

orcid: 0009-0009-1265-8516

email:

format:

revealjs:

theme: simple

slide-number: true

logo: "logo.png"

scrollable: true

editor: visual

bibliography: references.bib

tbl-cap-location: bottom


PISA

Motivation

Motivation

Current practises

Equal weights:

  • Pros: Stable over years, simple

  • Cons: Performs worse -> Measurement invariance

Regression imputation:

  • Pros: Simple

  • Cons: Outdated, performs worse

Allignment optimization

$$

F = \sum_p \sum_{g_m<g_n} w_{g_m.g_n} f(\lambda_{pg_m} - \lambda_{pg_n}) + \sum_p \sum_{g_m<g_n} w_{g_m.g_n} f(\tau_{pg_m} - \tau_{pg_n})

$$

in which p is the number of indicators, \(g_m\) and \(g_n\) indicate country m and n for every pair of countries in the data, \(\lambda_{pg_m}\) and \(\lambda_{pg_n}\) indicate the factor loadings of country m and n, and \(\tau_{pg_m}\) and \(\tau_{pg_n}\) indicate the intercepts of country m and n. Within the total loss function \(F\) the component loss function \(f\) scales the difference between the parameters for every pair of countries and for every measurement parameter:

$$

f(x) = \sqrt{\sqrt{x^2 + 0.01}}

$$

\(w_{g_m.g_n}\) is a weight that represents the size of the countries. It is defined as:

$$

w_{g_m.g_n} = \sqrt{N_{g_n} N_{g_m}}

$$

in which \(N_{g_n}\) and \(N_{g_m}\) are the number of participants from country n and m.

Creating HOMEPOS

# drop the items where a score is missing

data_homepos_long_complete <- data_homepos_long %>% drop_na(item_score)



# estimate the parameters

parms <- dexterMML::fit_2pl(data_homepos_long_complete %>% 

                              dplyr::select(person_id, item_id, item_score))

coef(parms)



# calculate theta: the ability estimate

theta <- ability.mml(data_homepos_long_complete, parms, method = "WLE")

PISA’s measure vs recreation

                 | Min | 1st Qu. | Median | Mean | 3rd Qu. | Max. |

|——————|——–|———|——–|——-|———|——|

HOMEPOS PISA | -10.20 | -1.12 | -0.37 | -0.44 | 0.30 | 5.92 |
HOMEPOS computed | -9.55 | -0.67 | 0.03 | -0.01 | 0.69 | 6.99 |
ESCS PISA | -8.17 | -1.01 | -0.19 | -0.28 | 0.59 | 4.21 |
ESCS computed | -4.87 | -0.53 | 0.06 | 0.00 | 0.63 | 3.46 |

PISA’s measure vs recreation

# load library

library(ggplot2, warn.conflicts = FALSE)
## Warning: package 'ggplot2' was built under R version 4.2.2
library(plotly, warn.conflicts = FALSE)
## Warning: package 'plotly' was built under R version 4.2.3
library(tidyr)
## Warning: package 'tidyr' was built under R version 4.2.2
# load data

data <- readRDS("recreation_escs_data.rds")



# drop missings

data_compare <- data %>% drop_na()



# select few countries

data_compare <- data_compare %>% filter(data.CNT == "ALB")



# make plot

HOMEPOS_plot <- ggplot(data_compare, aes(x = data.HOMEPOS, y = data.HOMEPOS_theta)) + geom_point(alpha = 0.05) +

  ggtitle("Comparing computed HOMEPOS to the PISA HOMEPOS") +

  xlab("Computed HOMEPOS") + ylab("PISA HOMEPOS") 

ggplotly(HOMEPOS_plot)

PISA’s measure vs recreation

References